TECHNICAL MEMORANDUM


COMPARISON OF COMMUNICATION ARCHITECTURES 
FOR SPACECRAFT MODULAR AVIONICS SYSTEMS 

1. INTRODUCTION 


This Technical Memorandum (TM) is a survey of publicly available information concerning 
serial communication architectures used, or proposed to be used, in aeronautic and aerospace applica- 
tions. It is not intended to be all-inclusive, but rather, focuses on serial communication architectures 
that are suitable for low-latency or real-time communication between physically distributed nodes in a 
system. Candidates for the study have either extensive deployment in the field or appear to be viable for 
near-term deployment. 

The motivation for this survey is to provide a compilation of data suitable for trading serial bus 
architecture against requirements for modular distributed real-time avionics architecture for man-rated 
spacecraft. This survey is one of the products of the Propulsion High-Impact Avionics Technology 
(PHIAT) Project at NASA Marshall Space Flight Center (MSFC). PHIAT was originally funded under 
the Next Generation Launch Technology (NGLT) Program to develop avionics technologies for control 
of next generation reusable rocket engines. After the announcement of the Space Exploration Initiative, 
in January 2004, the Exploration Systems Mission Directorate (ESMD) through the Propulsion Technol- 
ogy and Integration Project at MSFC funded PHIAT. At this time the scope of the project was broad- 
ened to include vehicle systems control for human and robotic missions. Early in the PHIAT project, a 
survey was performed to determine the best communication architecture for a safety critical real-time 
distributed control system. This survey was focused only on those communication architectures specifi- 
cally targeted for safety critical systems. However, with the broadened scope of the PHIAT project and 
NASA’s increasing interest in implementing integrated system health management (ISHM), it became 
clear that an expanded view needed to be taken concerning communications between physically and 
functionally distributed systems. 

The project team concluded that a one-size-fits-all communication architecture was
unlikely to satisfy all the avionics architecture needs with the added functions required for ISHM.
Communication architectures specifically targeted for hard real-time control generally do not provide the data
throughput necessary for transporting and managing the large amounts of data that are expected for
comprehensive ISHM. On the other hand, communication architectures for high-speed, large-volume data
transfer are generally not designed to provide the guaranteed low latency and high reliability required for 
safety critical, hard real-time control systems. Furthermore, most systems can be divided into a hierarchy 
of functions from safety critical (loss of function means loss of life and/or vehicle) to mission critical 
(loss of function means failure to meet mission goals) and through a descending range to those rated as
low criticality that are used offline for vehicle maintenance decisions after the mission is over. Using
one communications architecture to support all these functions would mean that some systems would 
not provide an adequate return on investment, while others could not perform in an optimal manner due 
to system limitations. This survey provides coverage of a range of communication architectures that can 
support many different tiers of critical functionality. The goal is to provide information that can be used 
to align communication architectures with the functionality needed to support modular avionics for the 
next generation spacecraft. 

In the context of this document, serial communication architectures are those that define a physi- 
cal layer, media access control, and possibly a protocol with data flow control and some level of error 
detection/correction. Such architectures are not just electrical specifications. Therefore, simpler serial 
buses such as RS-232, RS-485/422, and low-voltage differential signaling (LVDS) are not considered 
by themselves, but only when they specify the physical layer for a communication architecture. Serial 
communication standards, primarily for chip-to-chip or board-to-board communications, are not con- 
sidered because these are usually not suitable for long-haul communications and generally support only 
minimal media access control and protocols in typical applications. Examples of this type standard are 
serial peripheral interface (SPI) and inter-IC (I2C) bus. 
2. CANDIDATE ARCHITECTURES


The architectures selected have either extensive aerospace or aeronautic deployment history, 
are deployed in new vehicles, or have some potential to be included in future vehicles. If the net is 
cast widely, the number of serial communication architectures that exist is immense. There are sev- 
eral communication architectures used in industrial distributed control systems for factory and process 
automation. Additionally, there are communication architectures used to control the lighting, heating 
and elevator services in buildings. While these architectures are successful in their application field, the 
requirements for manned and robotic space vehicles differ significantly from those for industrial applica- 
tions, and much work may need to be done to convert such architectures for aerospace work. The goal 
of this study is to leverage off-the-shelf components as much as possible, and to minimize the changes 
needed to field the selected communication architectures. Communication architectures developed 
specifically for use in manned vehicle distributed control have a better chance of being ported to the 
aerospace environment unchanged. This may also apply to many of the more extensively used commu- 
nication architectures, such as Ethernet, as its wide use has spawned commercial interest in using it in 
manned vehicles. 

The communication architectures selected for study include event-triggered systems and time- 
triggered systems. Event-triggered communication refers to a system in which messages are generated 
based only on the need to transmit a new or changed piece of information, or to request that some infor- 
mation be transmitted to the requester. Ethernet is a good example of such a system. Used in communi- 
cation between computers (nodes) that are part of a network, either local or over the Internet, messages 
are sent over Ethernet when an individual at a network node decides to look at a Web page or send 
e-mail. For instance, transmitted messages are sent when the user enters a Web page address in a 
browser, and messages are received when the Web server (another node) sends back the requested
content. These messages are sent based on an event that can occur at any time, with no discernible
regularity. Time-triggered communication occurs at specified times based on a globally agreed upon time base.
Such communication is scheduled with the passage of time and each node that is part of the network is 
given a finite amount of time, or a slot, in which to transmit a message in each communication cycle. 

The sequence of message slots in the schedule is repeated over and over to create periodic message 
transmission slots for each node. Messages are sent by nodes that are part of a network at a predefined 
moment in time as referenced to a global time base. The time base is generated either on a clock refer- 
ence message sent by a network master node, or by combining clock messages from several nodes. The 
latter method is a masterless approach to generating a time base, and generally employs a fault-tolerant 
clock algorithm to produce clock corrections for each node in the network. This masterless approach to 
creating a global time base creates a masterless communication protocol in which the failure of any node 
does not prevent the other nodes from communicating with each other. 
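
The masterless time base described above can be illustrated with a fault-tolerant average, one commonly
described form of fault-tolerant clock algorithm. The sketch below is an illustration under stated
assumptions, not the algorithm of any particular architecture in this survey: each node measures the
offsets of its peers' clocks relative to its own, discards the k largest and k smallest values, and
averages the rest. The function name and the sample offsets are hypothetical.

```python
def fault_tolerant_average(offsets_us, k=1):
    """Return a clock correction from measured peer offsets (microseconds).

    Discards the k largest and k smallest measurements so that up to k
    faulty clocks cannot drag the correction arbitrarily far.
    """
    if len(offsets_us) <= 2 * k:
        raise ValueError("need more than 2*k measurements")
    trimmed = sorted(offsets_us)[k:len(offsets_us) - k]
    return sum(trimmed) / len(trimmed)


# Offsets of four peer clocks as seen by one node; one peer is faulty.
measured = [2.0, -1.0, 3.0, 950.0]
correction = fault_tolerant_average(measured, k=1)
print(correction)  # 2.5
```

Because every node applies the same rule to its own measurements, no single node acts as the time
master, which is the property the text attributes to the masterless approach.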

When the PHIAT team began exploring available options for real-time communications in safety 
critical distributed control systems, the information available indicated a clear preference for commu- 
nication architectures with time-triggered protocols (TTPs) over those with event-driven protocols. 
A report written in 2001 by Rushby gives a comparison of bus architectures targeted toward safety
critical systems. 1 Rushby includes an extensive list of references that provide further insight into the 
capabilities of these systems. The following systems are reviewed in the report and all employ a TTP: 
SAFEbus™, Time-Triggered Protocol (TTP™/C), FlexRay™, and SPIDER™. These architectures, with 
the addition of time-triggered controller area network (TTCAN), are the primary architectures targeted 
for safety critical systems. TTPs are considered by many to be a requirement for safety critical distrib- 
uted control systems, because the bus loading is known and constant, the message latency and jitter 
are known and constant, and the time-triggered nature of the communication supports composability. 
Composability means that the nodes, which are part of the time-triggered communication network, have 
precisely defined communication interfaces that can be developed by different manufacturers and will 
be guaranteed to integrate into the communication network. These time-triggered networks also employ 
different methods for fault tolerance and the ability to detect communication and node failures. All of 
the known time-triggered communication architectures are included in the study with the exception of 
SPIDER, which is intended as a case study for DO-254, “Design Assurance Guidance for Airborne
Electronic Hardware.” The purpose of the DO-254 case study is not necessarily to create deployable hardware,
but to gain experience in the lab with hardware adhering to the new guidance document. 2 As such, there
is no hardware that can be purchased openly or procured from the system designer for implementation. 
SPIDER is therefore inappropriate for consideration at this time. More details are provided on these 
buses in sections 2.2 through 2.5. 

While time-triggered systems offer a great deal in terms of addressing safety and highly depend- 
able operation, there are tradeoffs made to attain the high level of reliability needed. It is true that 
time-triggered communication provides a well-defined sequence of messages that ensure maximum bus 
loading stays at a prescribed level with no contention between the nodes for access to the bus, which 
is very important in the proper verification and operation of a hard real-time control system. However, 
there is a significant amount of upfront design that must be done to create the message schedule model 
and coordinate it with the timing of tasks at each node that require the data, and as such, a strict
configuration of the system is imposed. This strict configuration does not allow the addition of new nodes
or messages without redesigning the message and task schedule. Event-based systems have no such 
constraints; so new nodes with new message requirements can be added simply by attaching them to 
the physical layer. In some systems event-based communication may be more efficient, as the number 
of messages passed in a given amount of time may be sparse or the data payloads may be large. In the 
former case, the message slots in a time-triggered architecture (TTA) would still exist even if the nodes 
in the network had no new information to send. This means empty slots are taking network bandwidth 
that could be used to send larger messages. So, if large data payloads must be transmitted, a TTA may 
require splitting the data up into chunks transmitted over several transmission cycles. In some sys- 
tems, this may be unacceptable. For instance, when transmitting a video data stream, breaking up the 
data could lead to choppy motion, depending on the rate of the communication cycles, which would be 
annoying to a viewer. On the other hand, the video stream need not be hard real-time with guaranteed 
delivery. In most cases, a viewer can tolerate the occasional loss of a frame better than consistently 
choppy video. In this scenario, a high-speed, event-based system may be a better choice than time- 
triggered communication. These issues are part of the trade space that will be dictated by the functional- 
ity of the modules that are interconnected with the communication architecture. 
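
The bandwidth tradeoff described above can be made concrete with a little arithmetic. The sketch below
assumes a hypothetical schedule (slot size, slots per cycle, and cycle period are invented for
illustration and do not come from any architecture in this survey) and computes how many communication
cycles a large payload needs when a node owns one fixed-size slot per cycle, along with the fraction of
scheduled slots that carry new data.

```python
import math


def cycles_to_send(payload_bytes, slot_payload_bytes, slots_per_cycle=1):
    """Communication cycles needed to move a payload through fixed-size slots."""
    per_cycle = slot_payload_bytes * slots_per_cycle
    return math.ceil(payload_bytes / per_cycle)


def slot_utilization(slots_used, slots_total):
    """Fraction of scheduled slots that actually carried new data."""
    return slots_used / slots_total


# Hypothetical schedule: 32 bytes per slot, one slot per node, 5 ms cycle.
CYCLE_MS = 5
cycles = cycles_to_send(payload_bytes=1000, slot_payload_bytes=32)
print(cycles, cycles * CYCLE_MS)   # 32 cycles, 160 ms
print(slot_utilization(3, 10))     # 0.3
```

Under these assumed numbers, a 1000-byte payload is spread over 32 cycles even though most of the bus
time in each cycle may be idle, which is the situation the paragraph above describes.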
High-speed, event-driven communication architectures are included in the survey to provide
the system designer with the information needed to make a choice based on communication through- 
put, reliability, and real-time requirements for the distributed system being designed. Clearly, there will 
need to be other considerations than just criticality when designing a system to transmit and manage the 
expected large data load required for comprehensive ISHM. The following communication architectures 
that provide high-speed throughput are included in this survey: 

• Avionics Full-Duplex Switched (AFDX) Ethernet, currently in service on Airbus A380 

• Fibre Channel, used in the Joint Strike Fighter 

• Ethernet, operational in commercial aircraft and on the International Space Station, and also frequently
proposed for new spacecraft avionics

• Gigabit Ethernet, which uses the Fibre Channel physical layer and is proposed for use in military
systems, but has not yet been deployed in an aeronautic or aerospace application

• IEEE 1394b, used in the Jet Propulsion Laboratory X2000 spacecraft distributed avionics
architecture, which to date has not flown

• SpaceWire, utilized in robotic spacecraft missions by the European Space Agency (ESA) 
and NASA 

A description of each bus is provided in sections 2.6 through 2.11. 

One important point to make concerns the determinism of real-time communications. Many 
of the users and distributors of particular communication architectures call the communication over that 
medium real time and deterministic. There are two definitions for deterministic: (1) The term describing 
a system whose time evolution can be predicted exactly and (2) algorithms that may be part of a system 
whose correct next step depends only on the current state. For real-time communications only the first 
definition applies. Any communication architecture that uses arbitration cannot be deterministic in this 
sense, because minor variations in the timing of system functions will cause changes in which messages 
are arbitrated and transmitted at any given time in a particular communication cycle. So the messages 
transmitted will vary and not be exactly predictable. As a rule, TTPs do not use arbitration. However, 
some exceptions exist to provide time limited windows, or slots, for event-triggered messages. In this 
survey, only TTCAN and FlexRay specifically provide for arbitrated event-triggered message windows. 
MIL-STD-1553 is referred to as deterministic because it is a master-slave protocol. During normal 
fault-free operation, the master is in complete control of the message traffic on the bus. If specified mes- 
sages are sent by the master in a predefined order, then MIL-STD-1553 is deterministic with respect 
to time. In contrast, IEEE 1394b, Ethernet, and Fibre Channel all use arbitration to send messages in
their standard form and cannot claim to be deterministic with respect to time unless modifications to the 
standard implementation are made. 

MIL-STD-1553 is included because it is the most widely deployed serial communication archi- 
tecture in military and aerospace applications. It will be present in such systems for some time to come 
due to its reliability and long historical use. However, it is beyond its prime, and despite efforts to
increase its speed and capabilities, it is expected to eventually be supplanted by newer communication 
architectures in future military and space vehicles. 

The salient features of the communication architectures selected are compared in appendix A, 
table 1. While such tables are a good way to compare summarized data at a glance, they do not always 
provide a means of describing the compared items well. A brief description of each of the candidate 
architectures is given in sections 2.1 through 2.11. 

2.1 MIL-STD-1553 

The aircraft internal time division command/response multiplex data bus is a military standard 
with the designation MIL-STD-1553B. This revision was published in 1978 and the last change notice
was published in 1996 and is publicly available. 3,4 MIL-STD-1553 represents one of the first communi- 
cation data bus standards for transmission of digital data between systems and subsystems over a com- 
mon set of wires. The first users of the original version A, published in 1975, were the U.S. Air Force’s 
F-16 and the Army’s AH-64A Apache helicopter. 5 MIL-STD-1553 has found many applications 
including satellites, the Space Shuttle and the International Space Station. 

The standard defines a redundant serial communication bus that is used to interconnect nodes 
on a network and is commonly implemented in a dual redundant configuration. The transmission medium
is a twisted shielded pair consisting of the main bus and numerous stubs to create a multidrop
topology. There is currently no maximum bus length defined in the standard, and working systems with a
main bus length of several hundred meters have been implemented. However, it is highly recommended 
that the bus topology be built and tested prior to deployment to ensure proper performance. Time divi- 
sion multiple access (TDMA) allows communication between the interconnected nodes, while a single 
node designated as the bus controller (BC) supervises the bus access. The remaining nodes are remote 
terminals (RTs). They do not use a global clock and are only allowed to transmit data on the bus after 
it is requested, or commanded, by the BC. Commands from the BC may be asynchronous or they may 
follow a periodic pattern based on local timing at the BC. Nodes acting as backup BCs may exist on
the network to take over in the event of the primary BC failure. In a dual redundant configuration, data is 
not transmitted over redundant buses simultaneously, but rather one bus is used to transmit data for com- 
munication during normal operation and the other is in hot backup status used only to send commands in 
the event of node failure causing the primary bus to be monopolized by one node. The BC would send 
a transmitter shutdown message on the backup bus in an attempt to stop the node from babbling on the 
primary bus. The secondary bus could also be used to resume normal communication in the event the 
primary bus fails entirely due to physical damage. 

Communication over the bus is limited to 1 Mb/s, which is very slow if a message contains
more than a few bytes of data. Recently, new standards called enhanced bit rate
1553 (EBR-1553) and the miniature munitions/store interface (MMSI) have increased the speed to
10 Mb/s. They require a star, or hub, topology to provide the higher data rate, and therefore require
additional components to implement the architecture. 6 Additionally, there are reports that two companies 
are working for the Air Force on a new transmission standard using existing MIL-STD-1553 cabling. 
The idea is to overlay high-speed communication without disturbing the existing legacy communication. 
Laboratory prototypes reaching 200 Mb/s have been reported. 7 Recently, change notice five has been
released and incorporates the changes to the standard to support what is called Extended 1553 or E1553.
This change notice is not freely available to the public at this time. It is notable that the high-speed
communication is separate from the legacy 1 Mb/s communication, so the new systems will not
communicate with legacy systems at the high rate. These standards are relatively new; therefore, components
based on them do not have substantial deployment at this time. This is likely to change in the near future 
since MIL-STD-1553B components are in wide use and components based on the new standards should 
provide an upgrade path with existing software reuse. These standards are not included in this trade 
study due to the lack of publicly available standards documents and their current limited use. 
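
To put the legacy rate of 1 megabit per second in perspective, the sketch below estimates the duration
and effective payload rate of a single BC-to-RT transfer. It is a rough illustration, not a statement
of the standard: the 20-bit-time word (sync, 16 data bits, parity), the limit of 32 data words per
message, and the response and intermessage gap values are assumptions made here, and the function name
is hypothetical.

```python
BIT_RATE_BPS = 1_000_000   # legacy bus rate
BITS_PER_WORD = 20         # assumed: 3 sync + 16 data + 1 parity bit times
DATA_BITS_PER_WORD = 16


def bc_to_rt_transfer(data_words, response_gap_us=8.0, intermessage_gap_us=4.0):
    """Return (duration in microseconds, effective payload rate in Mb/s)."""
    if not 1 <= data_words <= 32:
        raise ValueError("assumed limit of 1 to 32 data words per message")
    word_us = BITS_PER_WORD * 1e6 / BIT_RATE_BPS
    # Command word plus data words from the BC, then the RT's status word.
    duration_us = (1 + data_words + 1) * word_us
    duration_us += response_gap_us + intermessage_gap_us
    payload_bits = data_words * DATA_BITS_PER_WORD
    return duration_us, payload_bits / duration_us  # bits per microsecond == Mb/s


duration, rate = bc_to_rt_transfer(32)
print(duration, round(rate, 2))  # 692.0 0.74
```

Under these assumptions even the largest message delivers its 64 bytes of payload at well under the
nominal bit rate, and short messages fare worse because the command word, status word, and gaps are a
fixed overhead.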

MIL-STD-1553 also served as the basis for a fiber optic version called MIL-STD-1773.
This standard still only provided for 1 Mb/s and has not enjoyed wide use. A new standard called
AS 1773 provides for 20 Mb/s, but still has not been a popular communication architecture in military
and aerospace systems.

Systems based on MIL-STD-1553 are considered to be extremely reliable and have been widely 
used in military and space applications. However, the need to transmit larger amounts of data at near 
real-time rates has led many designers of new military avionics to pursue other communication architec- 
tures. The cost of components is also high relative to components used in commercial communication 
architectures, such as Ethernet, due to the niche market that is targeted by suppliers of MIL-STD-1553 
components. The information in this section is a very brief overview. Complete details can be found in 
the standard and in manufacturer component and test equipment publications. 

2.2 SAFEbus 

SAFEbus is the registered trademark for the Honeywell implementation of ARINC 659 and is, 
by definition, the backplane bus in a computing cluster housed in a cabinet. It is currently part of the 
Boeing 777 avionics architecture. Communication with other cabinets and control and monitoring 
subsystems is achieved through input/output (I/O) modules using other bus protocols. This architecture
requires a quad redundant bus, in which two data lines and one clock line comprise each bus. Full
duplication of bus interface units (BIUs) is provided at each of eight nodes (four processing nodes and four
I/O nodes), providing a powerful but expensive architecture. The standard defines the capability to have
shadow nodes waiting in hot backup to take over if the primary node fails. SAFEbus has limited bus
length, but has a transmission rate of 60 Mb/s.

The level of reliability and redundancy provided by SAFEbus is extremely high, as it was specif- 
ically designed to support safety critical functions in commercial passenger aircraft. Most of the func- 
tionality is in the BIUs that perform clock synchronization and control data transmission based on 
message schedules. Each node has a pair of BIUs that drive different pairs of bus lines, but can read 
all four lines. BIUs act as the bus guardian for their partner BIUs by monitoring transmitted data and 
transmission scheduling and controlling its partner’s access to the bus lines. This prevents a faulty BIU 
from becoming a babbling idiot or transmitting erroneous data. Data transmission is time-triggered and 
is governed by a message schedule. Synchronized timing of messages delivered is maintained using a 
global clock. The clocks are synched via periodic pulses on the dedicated clock line. Because the mes- 
sage schedules include sender and recipient information, the data packets include no header information, 
but are pure data. There is also no cyclic redundancy check (CRC) or parity information transmitted with
the data because the BIU pairs check all data transmitted on the bus signal pairs by the node they sup- 
port. Each BIU checks its data and its partner’s data for errors. These features result in a very efficient, 
masterless transmission protocol. 
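
The mutual checking performed by a BIU pair can be sketched as follows. This is a simplified model
written for illustration only; the function name, the slot numbering, and the idea of comparing whole
words are assumptions and do not reflect the actual ARINC 659 hardware. Each BIU permits its partner to
drive the bus only when the current slot belongs to the node in the message schedule and the partner's
data matches its own, so any disagreement leaves the node silent.

```python
def paired_transmit(slot, scheduled_slots, word_x, word_y):
    """Return the word placed on the bus, or None if the pair stays silent.

    word_x and word_y are the values the two BIUs of one node intend to send.
    Each BIU enables its partner's drivers only when the slot is owned by the
    node and the partner's word matches its own.
    """
    in_schedule = slot in scheduled_slots
    x_allows_y = in_schedule and word_y == word_x
    y_allows_x = in_schedule and word_x == word_y
    if x_allows_y and y_allows_x:
        return word_x
    return None


node_slots = {3, 7}
print(paired_transmit(3, node_slots, 0xA5, 0xA5))  # 165
print(paired_transmit(3, node_slots, 0xA5, 0xA4))  # None (data mismatch)
print(paired_transmit(4, node_slots, 0xA5, 0xA5))  # None (not this node's slot)
```

In this model a single faulty BIU can silence its node but cannot babble or place erroneous data on the
bus, which is the behavior the text attributes to the bus guardian role.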

A system that is designed to be fault tolerant should have a fault hypothesis by which its perfor- 
mance can be evaluated. The fault hypothesis for the SAFEbus architecture states that it is guaranteed 
to tolerate one arbitrary fault, but may tolerate multiple faults. At most, one component of any pair can 
fail (i.e., the BIU, the processing module, or one of the dual redundant bus lines). When one component
of a node fails, the node must fail silent, thus removing itself from operation. Nodes with important
functions must be redundant to be able to continue normal operation. 

The SAFEbus architecture is considered to be very dependable for safety critical functions, but 
it is also very expensive. The hardware is redundant as a pair of pairs at all levels and the components 
are proprietary to Honeywell. The components are not available as commercial off-the-shelf products. 
Despite the creation of the ARINC 659 standard, it does not appear that other independent companies 
have created ARINC 659 compliant components. More information on SAFEbus can be found in the 
standard and in papers and reports written on the subject. 7,1,8

2.3 Time-Triggered Communication Protocol 

The TTA developed at the Vienna University of Technology uses a time-triggered communication protocol
called TTP/C. Specifications for TTP/C were first published in 1993. 9 The C in TTP/C stands for auto- 
motive class C referring to the hard real-time communications requirement. Indeed, the automobile 
industry funded much of the TTA development and the TTP/C protocol to support future drive-by-wire 
applications. TTTech, a company based in Austria, has commercialized TTP/C and the communication 
controller integrated circuit devices are now available for purchase. These devices implement the pro- 
tocol in hardware and are openly available to any system developer. TTP/C has been applied to a wide 
variety of manned transportation vehicles including the Airbus A380 cabin pressure control system,
full-authority digital engine controllers for military aircraft, and railway signaling and switching
systems in Switzerland, Austria, and Hungary. It has also been used in drive-by-wire concept cars. TTP/C is
designed to provide a high level of reliability and availability at a cost suitable for mass production. 

TTP/C is a fault-tolerant TTP providing important services such as autonomous message trans- 
port based on a schedule with known delay and bounded jitter over dual redundant communication chan- 
nels. TTA, and therefore TTP/C, supports the implementation of redundant nodes or redundant functions 
executing on multiple nodes. Current implementation of the communication controller chip includes a 
fault-tolerant global clock to establish a time base, membership services to inform all nodes of the health 
status of the other nodes, and message status set by both the sender and the receiver. The protocol is 
masterless, which allows communication to continue between the remaining nodes on the network when 
other nodes fail. Bus guardians are included in the TTP/C communication controller hardware, but are 
part of the same device and share a common clock. TTP/C is designed to be physical layer independent. 
Current controller chips support communication at 5 Mb/s over RS-485 and 25 Mb/s over the Ethernet
physical layer. There is reported to be an effort to develop a 1 Gb/s implementation using Gigabit
Ethernet as the physical layer. The TTP/C fault hypothesis guarantees that the communication system 
can tolerate any single fault in any component of the architecture. It can tolerate multiple faults
depending on the application. More information on TTP/C can be found in the specification. 10 The specification
document is available free upon request from TTTech. 

2.4 FlexRay 

The FlexRay protocol is specifically designed to address the needs of a dependable automotive 
network for applications like drive-by-wire, brake-by-wire, and power train control. It is designed to 
support communication over single or redundant twisted pairs of copper wire. It includes synchronous 
frames and may include asynchronous communication frames in a single communication cycle. The
synchronous communication frames are transmitted during the static segment of a communication cycle.
All slots are the same length and are repeated in the same order every communication cycle. Each node
is provided one or more slots whose position in the order is determined at design time. Each node inter- 
face is provided only with the information concerning its time to send messages in this segment and 
must count slots on each communication channel. After this segment, the dynamic segment begins with 
the time divided into minislots. At the beginning of each minislot there is the opportunity to send a
message; if one is sent, the minislot expands into the message frame. If a message is not sent, the minislot
elapses as a short idle period. Messages are arbitrated in this segment by sending the message with the
lowest message ID. It is not required that messages be sent over both communication channels when 
a redundant channel exists. 
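
The minislot mechanism of the dynamic segment can be sketched with a small simulation. This is a
simplified model under stated assumptions, not the FlexRay specification: frame lengths are expressed
in whole minislots, a frame that no longer fits in the segment is simply held for a later cycle, and
the function name and sample values are hypothetical.

```python
def run_dynamic_segment(pending, segment_minislots):
    """Simulate one dynamic segment.

    pending maps frame ID -> frame length in minislots. The slot counter
    starts at 1, and a node may transmit only when the counter equals its
    frame ID, so lower IDs get the earlier opportunities.
    Returns a list of (start_minislot, frame_id) for the frames sent.
    """
    sent = []
    minislot = 0        # minislots consumed so far
    slot_counter = 1    # frame ID whose turn it is
    while minislot < segment_minislots:
        length = pending.get(slot_counter)
        if length is not None and minislot + length <= segment_minislots:
            sent.append((minislot, slot_counter))
            minislot += length   # the minislot expands into a full frame
        else:
            minislot += 1        # an unused minislot elapses as idle time
        slot_counter += 1
    return sent


print(run_dynamic_segment({2: 4, 5: 3, 6: 6}, segment_minislots=12))
# [(1, 2), (7, 5)]
```

In this run frame 6 does not fit in what remains of the segment and is not sent, which illustrates why
low-priority (high ID) messages in the dynamic segment have no guaranteed latency.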

No membership services are provided by FlexRay to detect faulty nodes. Clock synchronization, 
through messages sent by specific nodes, is the only service provided. There is also no bus guardian 
specification currently published and no published fault hypothesis. The FlexRay consortium, consisting 
of many major automotive companies, has indicated it has no interest in any field of application other 
than the automotive industry. The hardware that has been developed is only available to the consortium 
members and cannot be purchased by nonmembers. Only recently have the protocol and physical layer 
specifications been publicly available. 11,12 FlexRay is included because it has the potential to be applied
to aerospace applications, despite the current lack of interest by the consortium. 

2.5 Time-Triggered Controller Area Network 

The TTCAN specification is an extension to the standard controller area network (CAN) to provide
time-triggered communication. Standard CAN uses carrier sense multiple access with collision detection 
and arbitration on message priority (CSMA/CD+AMP) for message arbitration. Simply stated, when 
there is an attempt by two nodes to send a message simultaneously, the message with the lowest ID 
number is transmitted. Additionally, standard CAN controllers will retransmit a message when no 
acknowledgement is received. 
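
The arbitration rule stated above, in which the lowest ID wins, follows from bitwise arbitration on the
bus. The sketch below is a simplified model for illustration: it assumes 11-bit identifiers sent most
significant bit first, treats a 0 bit as dominant, and ignores everything in the frame after the
identifier. The function name is hypothetical.

```python
def arbitrate(ids, id_bits=11):
    """Return the identifier that wins bitwise arbitration.

    A 0 bit is dominant and a 1 bit is recessive. The bus carries a 0 if any
    contender sends 0, and a node that sends 1 but reads back 0 drops out.
    Assumes a non-empty collection of distinct IDs that fit in id_bits.
    """
    contenders = set(ids)
    for bit in range(id_bits - 1, -1, -1):              # most significant bit first
        bus = min((i >> bit) & 1 for i in contenders)   # wired-AND of all senders
        contenders = {i for i in contenders if (i >> bit) & 1 == bus}
    (winner,) = contenders
    return winner


print(hex(arbitrate([0x65A, 0x123, 0x124])))  # 0x123
```

The winning node never sees a conflict, so its frame goes out undamaged and no bus time is lost to the
collision; the losing nodes simply try again later.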

TTCAN can be implemented in software or hardware to use a system matrix that defines a 
schedule for message transmission over a communication cycle. This schedule includes slots for specific 
messages that are sent every cycle and slots for standard arbitration, so event-triggered messages can 
be transmitted. TTCAN still uses CSMA/CD+AMP, as implemented in standard CAN controllers, to 
ensure proper arbitration during the arbitrated frames. During the scheduled frames there should be 
no bus contention, and the arbitration service will not be used. TTCAN can only be implemented
on CAN controllers with the capability to turn off the retransmit feature. 
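
The mix of scheduled and arbitrated slots can be sketched as one row of a system matrix. This is an
illustrative model only, not the ISO 11898-4 data structures: the window labels, the function name, and
the message IDs are assumptions, and the reference frame that starts the cycle is omitted.

```python
def run_basic_cycle(windows, periodic, event_queue):
    """Return the message IDs sent in one cycle (None marks an empty window).

    windows      -- list of ("exclusive", msg_id) or ("arbitrating", None)
    periodic     -- set of scheduled message IDs that are ready to send
    event_queue  -- list of pending event-triggered message IDs
    """
    pending = sorted(event_queue)
    sent = []
    for kind, msg_id in windows:
        if kind == "exclusive":
            # Only the owner may transmit; otherwise the slot stays empty.
            sent.append(msg_id if msg_id in periodic else None)
        elif pending:
            # Lowest ID wins arbitration; the losers wait for a later
            # arbitrating window because retransmission is turned off.
            sent.append(pending.pop(0))
        else:
            sent.append(None)
    return sent


schedule = [("exclusive", 0x10), ("arbitrating", None),
            ("exclusive", 0x20), ("arbitrating", None)]
result = run_basic_cycle(schedule, periodic={0x10}, event_queue=[0x300, 0x105, 0x2FF])
print([hex(m) if m is not None else None for m in result])
# ['0x10', '0x105', None, '0x2ff']
```

The sketch shows why automatic retransmission must be disabled: a message that loses arbitration or
goes unacknowledged must not spill into the next exclusive window.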

Clock synchronization is achieved by designating one node as the time master. This node sends 
a reference frame to begin the communication cycle. The maximum transmission rate is 1 Mb/s but is
typically lower in application, on the order of 500-650 kb/s. TTCAN is targeted to the automotive
industry, but CAN has found applications in industrial automation and some military systems. So it is 
included for its potential to be used in aerospace applications. TTCAN is specified by the international 
standard ISO 11898-4 “Time-Triggered Communication on CAN.” There is also information in papers 
published on the subject. 13 ’ 14 


2.6 IEEE 1394b 

IEEE 1394 (Firewire) is a communication architecture that has generated much interest in aerospace
applications, as evidenced by the Jet Propulsion Laboratory’s use of the legacy IEEE 1394-1995
in the X2000 fault-tolerant avionics system for the Deep-Space System Technology Program.15 Interest
in IEEE 1394 for space applications stems from the fast communication rates over copper wiring, and
the availability of intellectual property (IP) cores for use in the fabrication of application specific
integrated circuit (ASIC) devices. This survey covers IEEE 1394b, which supports data rates from
100 Mbits/s up to 3.2 Gbits/s and also supports the specifications in the legacy standards. Communication
is specified over shielded and unshielded twisted pairs as well as plastic and glass optical fiber. The
transmission medium and its length affect the maximum transmission rate.16

The communication protocol used is characterized by an isochronous transmission phase and 
an asynchronous transmission phase. Isochronous transmission refers to broadcast transmissions to 
one node or many nodes on the network without error correction or retransmission. This is useful for 
video data where loss of a frame now and then is acceptable, but choppy error-free video is not desir- 
able. Asynchronous transfers are targeted to a specific address (another node) on the network and are 
acknowledged by the recipient, allowing error checking and the retransmission of messages. This is 
used for data that must be transmitted error free. Arbitration for bus access occurs for each transmission 
phase. IEEE 1394b speeds up the arbitration process by using bidirectional communication in which the 
arbitration frames are sent while data frames are being sent. 

IEEE 1394 uses point-to-point connections in a tree topology and does not support loops. How- 
ever, there exists the capability to disable ports, so a loop may be connected, and in the event a link fails 
the disabled port can be enabled to reestablish connectivity with all the nodes. At start up, an identification
process is used to assign addresses to the nodes and to select the root node and the isochronous master. Adding
or removing devices requires the identification process to execute again. The family of standards speci- 
fying the legacy architecture of IEEE 1394-1995, IEEE 1394a, and the updated architecture IEEE 1394b 
are available for purchase from IEEE. 


2.7 SpaceWire 

SpaceWire, developed in Europe for use in satellites and spacecraft, is based on two existing
standards: IEEE 1355 and LVDS.17,18 It has found application on NASA’s Swift spacecraft and
several ESA spacecraft such as Rosetta, and has been proposed for use on the James Webb Space Telescope.
The European Cooperation for Space Standardization has published a SpaceWire specification.19

The transmission physical layer is shielded twisted pair and point-to-point. A large network of 
devices can be created using cascades of hubs or switches that route messages from one node to another. 
This requires the message packet to contain address or routing information that is used by the hubs and 
switches to send the data to the recipient. The standard does not specify the arbitration schemes that 
will be needed at the hubs and switches. It does, however, establish the concept of port credit to regulate
message flow across a link. Senders must not exceed the data buffering capacity of a port. Buffer space
availability is tracked by flow control tokens. The SpaceWire specification indicates the maximum data
transfer rate is 400 Mbits/s. Data transmission is event triggered in this architecture.
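
The port credit concept can be sketched from the sender's side as below. This is a minimal illustration, not the SpaceWire link state machine: the class name `CreditedLink` is hypothetical, and the defaults for characters granted per token and for the credit ceiling are assumed values that should be checked against the SpaceWire specification.

```python
class CreditedLink:
    """Sender-side view of one link end using credit-based flow control."""

    def __init__(self, chars_per_token=8, max_credit=56):
        # Both defaults are assumptions chosen for illustration.
        self.chars_per_token = chars_per_token
        self.max_credit = max_credit
        self.credit = 0  # nothing may be sent until a token arrives

    def receive_token(self):
        """The far end reports that it has freed room for more characters."""
        if self.credit + self.chars_per_token > self.max_credit:
            raise ValueError("token would exceed the receiver's buffer size")
        self.credit += self.chars_per_token

    def send(self, data):
        """Send as many characters as the credit allows; return the rest."""
        n = min(len(data), self.credit)
        self.credit -= n
        return data[n:]
```

Because the sender never transmits more than the credit it holds, the receiving port's buffer cannot overflow, and no data has to be dropped or retransmitted for lack of space.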

2.8 Ethernet 10/100 Base-T 

Ethernet is one of the most widely used communications architectures for computer networks at 
business, government, and educational institutions. It has also found military and aerospace application, 
and is currently used on the International Space Station. The 802.3-2002 IEEE Standard defines Ether- 
net while the current revision of this standard includes specifications for Gigabit Ethernet. Because the 
10/100 Base-T implementation of Ethernet and the 1/10 G Base-X implementation have some significant 
differences, Gigabit Ethernet is described separately in section 2.11. 

As the designation suggests, 10/100 Base-T Ethernet provides data transmission rates of 10 Mbits/s
and 100 Mbits/s over unshielded twisted pair. Ethernet can operate in half-duplex mode (all nodes share
the same cables) or full-duplex mode (nodes can communicate over dedicated cabling with one other 
device). In half-duplex operation CSMA/CD governs the way computers share the channel. This works 
by only initiating data transmission when the line is idle. If two nodes initiate transmission at the same 
time, a collision is detected and transmission ceases. Each node then waits until the line is idle, and then 
waits a random amount of time to begin transmitting again. The two nodes will hopefully select different 
random wait times and gain access to the bus. Clearly, this can result in extremely inefficient communi- 
cation, especially when data traffic is heavy. Full-duplex mode is possible when the nodes are connected 
to a switch that allows a dedicated connection between the switch port and the node. The switch is now 
responsible for routing the message to the intended recipient without contention. 
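
The random wait after a collision can be sketched as a truncated binary exponential backoff, in which the range of possible waits doubles with each successive collision. The constants and the function name `backoff_slots` below are assumptions made for illustration; the authoritative values are those in IEEE 802.3.

```python
import random

MAX_ATTEMPTS = 16   # assumed limit before the frame is abandoned
BACKOFF_LIMIT = 10  # assumed cap on the exponent


def backoff_slots(collisions, rng=random):
    """Return how many slot times to wait after the given collision count."""
    if not 1 <= collisions <= MAX_ATTEMPTS:
        raise ValueError("collision count out of range")
    exponent = min(collisions, BACKOFF_LIMIT)
    # Uniform choice from 0 .. 2**exponent - 1 slot times.
    return rng.randrange(2 ** exponent)
```

After a first collision each node draws either 0 or 1, so two colliding nodes pick the same wait half the time and collide again. This is why the access delay in half-duplex mode has no fixed upper bound under heavy traffic.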

The protocol used to send messages affects the reliability of the transmission, the overhead in 
the message packet, and the time required to complete a message transaction. Two popular protocols 
are (1) user datagram protocol (UDP) and (2) transmission control protocol (TCP). UDP is an unreliable
connectionless protocol with no guarantee that the data will reach its destination. It is meant to
provide barebones service with very little overhead. TCP adds significant overhead to the transmission 
process, when compared to UDP, but it provides a reliable connection that requires the sender (client) 
and receiver (server) to open a connection before sending data, ensures messages are received properly, 
sequences packets for transmission, and provides flow control. IEEE Standard 802.3-2002 is the most
recent revision of the standard specifying Ethernet.20

2.9 Avionics Full-Duplex Switched Ethernet


AFDX Ethernet is a trademark of Airbus. It was developed for use in the A380 passenger plane.
It is a standard that defines the electrical and protocol specifications for the exchange of data between
avionic subsystems using IEEE 802.3 (100 Base-TX) for the communications architecture. The ARINC
664, Part 7 standard builds on the proprietary standard developed by Airbus. The AFDX communication
protocol has been derived from commercial databus standards (IEEE 802.3 Ethernet medium
access control (MAC) addressing, IP, and UDP) and adds deterministic timing and redundancy management
with the goal of providing secure and reliable communications of critical and noncritical data.
It capitalizes on the huge commercial investment and advancements in Ethernet.

The issue of deterministic communications is addressed by defining communication virtual links 
(VLs) between end systems with specified maximum bandwidth, bounded latency, and frame size during
system design. These VLs must share the 100 Mbits/s physical link. The switches are provided with
a configuration table that defines the network configuration. Queues at each port and switches used to 
route the messages may introduce jitter in the message latency, or receive time of the message. This jitter 
is due to random delays in transmission based on the message transmission volume at a given time, and 
is required to be less than 500 µs at the output of the transmitting end system. This jitter bound does not
include jitter due to switches or at the receiver. Messages on a VL are sent with a sequence number that is
used on the receiving end to verify that the sequence numbers within a VL are in order. This is referred
to as integrity checking.
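
Because the VLs share one physical link, a system designer has to confirm that their combined worst-case bandwidth fits within it. The sketch below shows that arithmetic under stated assumptions: each VL is described by a maximum frame size and a minimum gap between frames (a parameterization assumed here, not quoted from the standard), and framing overhead such as the preamble and interframe gap is ignored. The function names are hypothetical.

```python
LINK_BITS_PER_S = 100_000_000  # 100 Mbits/s physical link


def vl_bits_per_s(max_frame_bytes, min_gap_ms):
    """Worst-case bandwidth of one VL: one maximum-size frame per gap."""
    return max_frame_bytes * 8 * 1000 / min_gap_ms


def link_utilization(virtual_links):
    """Fraction of the link consumed by (max_frame_bytes, min_gap_ms) pairs."""
    total = sum(vl_bits_per_s(size, gap) for size, gap in virtual_links)
    return total / LINK_BITS_PER_S
```

For example, a VL sending frames of up to 1518 bytes no more often than every 2 ms can consume at most 6,072,000 bits/s, or about 6 percent of the link.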

A redundant set of switches and physical links is required by the AFDX standard. Data is replicated
across the redundant channels, and redundancy management consists of passing on the first valid message
received on one channel and discarding the duplicate. The redundancy management function may also introduce
message-timing jitter that is included in the overall transmission jitter requirement of less than 500 µs. AFDX
provides message error detection and the capability for switches to enter quiet mode in the event of
catastrophic failures within the switch. Node failures resulting in inappropriate messages cause the switch to
discard the messages. No mechanism is specified to inform the receiving node of sending node errors.
AFDX has no published fault hypothesis. More information concerning AFDX is found in the ARINC
standard.21,22
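
The first-valid-message rule can be sketched for a single virtual link as follows. This is a simplification rather than the ARINC 664 algorithm: it treats the sequence number as a plain 8-bit counter, folds the ordering check and duplicate removal into one step, and uses a hypothetical `crc_ok` flag to stand in for frame error detection.

```python
class VirtualLinkReceiver:
    """Accept the first valid copy of each frame on one virtual link."""

    def __init__(self, modulus=256):
        self.modulus = modulus  # assumed sequence number range
        self.last_seq = None    # sequence number of the last accepted frame

    def receive(self, seq, payload, crc_ok=True):
        """Return the payload if the frame is accepted, otherwise None."""
        if not crc_ok:
            # Invalid copy; the redundant channel may still deliver a good one.
            return None
        if self.last_seq is not None:
            ahead = (seq - self.last_seq) % self.modulus
            # ahead == 0 is the duplicate from the redundant channel; a
            # large value means an old or out-of-order frame.
            if ahead == 0 or ahead > self.modulus // 2:
                return None
        self.last_seq = seq
        return payload
```

The application sees each message once regardless of which channel delivered it first, so the loss of one switch or link is masked without any action by the application.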


2.10 Fibre Channel 

As specified by a large collection of standards published by the American National Standards 
Institute (ANSI), Fibre Channel is designed to be a high-performance data transport connection technol- 
ogy supporting transmission via copper wires or fiber optic cables over long distances. It is designed 
to support a variety of upper level protocols mapped onto the physical delivery service. Fibre Channel 
was originally developed for storage applications and is primarily used to implement storage area networks
(SANs). It has been selected for use in military aircraft avionics, most notably the F/A-18 Hornet
Fighter-Bomber avionics upgrades and the Joint Strike Fighter. One ANSI standard addresses the application
of Fibre Channel to the avionics environment. 

Fibre Channel is a full-duplex communication architecture that supports a variety of topologies 
such as point-to-point, arbitrated loop, and switched fabric. The switched fabric topology is used in 
the Joint Strike Fighter. The switches must keep track of address information to send messages from one
node to another. Fibre Channel supports several classes of transmission as follows: 

• Class 1: Provides a dedicated connection with acknowledgment, guaranteeing delivery and message
  sequence.

• Class 2: Connectionless and may deliver messages out of order; delivery confirmation is provided.

• Class 3: Connectionless and unconfirmed. Flow control is provided based on port credit, similar
  to SpaceWire. Data is only sent when the credit counter indicates buffer space is available.

While Fibre Channel is extremely fast, it is not deterministic in its standard form. Delays 
through switches increase as network traffic increases. With large network sizes, it is impossible to 
analyze these delays, as they are functions of multiple variables.23 Fibre Channel also has many
characteristics that make it attractive, including the availability of off-the-shelf components, capability for
plug-and-play, and support of hot-swappable components. To address the determinism issue, the Fibre 
Channel avionics environment (FC-AE) working group developed standards pertaining to upper-level 
protocols with the goal of augmenting Fibre Channel to provide deterministic latency. Of particular 
interest is FC-AE-1553, which involves creating a deterministic command/response protocol that can
leverage existing system designs based on MIL-STD-1553, but make full use of the Fibre Channel 
characteristics. The comparison table entries are primarily for the switched topology implementation 
of Fibre Channel and the standard characteristics. The FC-AE related standards are not included in this 
TM because coverage of all upper-level protocols that could run on Fibre Channel is outside the scope 
of this survey. The numerous standards that specify Fibre Channel are also not referenced in this survey. 
More information can be found at www.t11.org and the standards in their final published form may be
purchased from ANSI. 


2.11 Gigabit Ethernet 

1000/10 G Base-X Ethernet is included as a separate section because it is a combination of the 
IEEE 802.3 standard and the Fibre Channel physical layer standards. It is widely used in
commercial, government, military, and educational institution networks and typically uses TCP/IP or
UDP/IP, as is done with Ethernet. It supports both copper wire and fiber optic transmission media. 

The transmission rate is very fast and it can be implemented over long distances (40 km is reported). 
The maximum length of the transmission medium is determined by the medium itself. 10 G Ethernet 
only supports full-duplex operation, while 1 G Ethernet will support half-duplex transmission. Other 
than the differences in the physical layer, 1000/10 G Base-X operates the same as 10/100 Base-T. Like 
Fibre Channel, there is much interest in implementing Ethernet in military systems; however, no
publicly available information exists on any deployment in military systems. IEEE Standard 802.3ae-2002
specifies 1000/10 G Base-X Ethernet.24

3. SELECTION RATIONALE


Early in the project, the PHIAT team needed to select the communication architecture to sup- 
port a hard real-time distributed control system for safety critical systems in a manned spacecraft. These 
systems include propulsion, spacecraft navigation and attitude control, automated docking, vehicle 
health management, and life support. Based on requirements developed by the PHIAT team, the result- 
ing distributed system had to support fault detection, containment, and tolerance while providing high 
reliability and high availability. Additionally, the system must employ modular components at all levels 
for high reusability, flexibility, and scalability, and these components must support plug-and-play and be 
hot swappable wherever possible. Also required was the capability to distribute functionality and intel- 
ligence to enable the use of existing radiation-hardened processors and provide complex functionality 
for fault detection, isolation, and recovery (FDIR) and health monitoring. Finally, the system must be 
sustainable with respect to nonrecurring engineering, upgrade, and maintenance costs. The capability 
to transmit large amounts of data at an extremely high rate was not a requirement. Most control loops 
operate at a rate of 100 Hz or less. The SSME controller operates at a rate of 50 Hz and the flight control 
loop in the Space Shuttle general purpose computers executes at 25 Hz.25,26

These requirements were best met by TTP/C for several reasons. TTP/C is designed specifically 
for safety critical, hard real-time distributed control. As such, it provides the guaranteed latency and 
jitter that is needed to ensure that the data required for distributed control functions is delivered in a 
timely and predictable fashion. The use of a predefined message schedule with a fault-tolerant global 
clock provides known and exactly predictable communication bus loading and message sequencing. 
Most importantly, the protocol is masterless. The failure of a single node, or even several nodes, does 
not prevent synchronized communication from continuing between the remaining nodes. Fault detection, 
containment, and tolerance are provided via the membership services, message status, data consistency 
checks, and bus guardian functions implemented in the hardware. TTP/C imposes a physically and func- 
tionally distributed architecture that partitions the application hardware and the communication network. 
This not only prevents application errors from propagating from one node to another, but also simplifies 
software development due to the implementation of protocol components in the hardware. The commu- 
nication network looks like shared memory to the application software on each node. All that is required 
for communication is periodic reading from and writing to the memory locations. 
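
The effect of a predefined schedule and masterless operation can be illustrated with a small sketch. It is not the TTP/C protocol itself: the node names, the slot length, and both functions are assumptions, and the global time is simply passed in rather than derived from a fault-tolerant clock synchronization algorithm.

```python
SLOT_US = 500  # assumed slot length in microseconds
SCHEDULE = ["engine_ctrl", "nav", "docking", "health"]  # hypothetical nodes


def slot_owner(global_time_us, schedule=SCHEDULE, slot_us=SLOT_US):
    """Return the node that owns the bus at the given global time."""
    return schedule[(global_time_us // slot_us) % len(schedule)]


def round_traffic(alive, schedule=SCHEDULE):
    """List what each slot of one round carries given the set of live nodes.

    A failed node leaves its own slot empty; the other slots are unaffected
    because no node depends on a bus master to be granted access.
    """
    return [node if node in alive else None for node in schedule]
```

Every node evaluates the same schedule against the same global time, so bus loading and message order are known before the system runs. With these assumed values one round lasts 2 ms, well inside the 10 ms period of a 100 Hz control loop.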

TTP/C supports hot swap of nodes on the network. Faulty nodes can be replaced and new nodes 
integrated without powering down the rest of the system. This along with the strict interface specifica- 
tion supports modularity in system upgrades and new system integration. Modules can be upgraded 
and swapped with existing modules without disturbing the system and without full-system requalifica- 
tion. The strict interface definition allows different manufacturers to create modules and essentially 
guarantees successful integration if the interface definitions are enforced. 

The communication rates supported by TTP/C hardware currently available are suitable for 
the real-time control requirements of all safety critical vehicle subsystems. Higher data rates are only 
needed if noncritical data is transmitted along with critical data. From a control system standpoint, there 
is no need to transmit data like a video stream or vibration data streams from multiple channels. Rather,
this data would be transmitted directly to a local processing node that would then transmit the analyzed 
results obtained from this data to the components that need it. In the case of a video stream for auto- 
mated docking, the information transmitted across the hard real-time network would be the coordinates 
of the target that are needed by the controller for the reaction control system. Since TTP/C is designed 
to be physical layer independent, higher speed transmission can be obtained by moving to an appropriate 
physical layer, if the need arises. 

Finally, TTP/C represents a cost-effective solution. The communication controller and supporting 
development software are commercially available at a reasonable cost to any interested party wanting to 
purchase them. The communication controller can be implemented in a radiation-tolerant FPGA or in a 
radiation-hardened ASIC device for deployment in space. The distributed system architecture supported 
by TTP/C allows the use of currently available radiation-hardened processors in the implementation of 
complex control and monitoring functions. Implementation of the protocol in the hardware reduces the 
complexity and cost of software development. The capability to network nodes at distances up to 100 m 
allows components to be placed in confined locations and reduces long runs of bulky wiring bundles by 
placing the nodes close to the system components being monitored and controlled. The wiring connec- 
tions to multiple sensors and actuators can be shortened and only the lightweight twisted pair buses will 
be routed over significant lengths. 

TTCAN is slow at 1 Mbit/s, but may be useful as a secondary field bus to interface with less
critical control and monitoring components. SAFEbus is a proprietary implementation and is not
commercially available as components. Despite its high level of reliability and proven track record, the lack
of commercially available components makes SAFEbus less attractive for an implementation with a 
small development budget. Furthermore, it is a backplane bus that does not support the physical distribu- 
tion of networked nodes. FlexRay could provide the functionality needed, but the associated hardware 
is less mature and is only available to members of the FlexRay consortium. Additionally, FlexRay does 
not implement services such as membership, message status, and consistency. These would have to be 
implemented in the application software. 

While AFDX shows some promise, it does not have inherent fault tolerance and would require 
additional software and hardware implementation to meet the same level of reliability and fault tolerance 
as TTP/C. Furthermore, the event driven nature of the standard has the potential to make it difficult to 
truly implement real-time communication with known latency. Without a bus guardian function, AFDX 
is subject to a faulty node monopolizing a link. SpaceWire suffers similarly, but has the attraction of 
having been deployed in space. A TTP/C implementation over SpaceWire, switched Ethernet, or Fibre 
Channel is possible with some, most likely significant, development cost. All these switched fabrics 
should have the capability to support a time-triggered upper level protocol with some modification to 
the transmission medium. 

Taking all this into account, the choice is to use TTP/C to implement the modular real-time con- 
trol system that the PHIAT team is tasked to develop. This control system architecture has come to be 
known as the integrated safety critical advanced avionics for communication and control (ISAACC) sys- 
tem. Based on this survey, TTP/C provides all the functionality needed to meet the requirements defined 
by the PHIAT team. 

4. CONCLUSION


This survey is intended to provide data to aid in the selection of communication architecture for 
future spacecraft avionics systems. It is not an exhaustive survey, but it provides good coverage of the 
communication architectures currently being used or proposed for aircraft and aerospace vehicles. 

The rationale for selection of TTP/C for the ISAACC system is presented. This shows how the
PHIAT team used the data to select communication architecture suitable to complete the task of imple- 
menting a modular, distributed, and hard real-time control system for manned spacecraft. Other design- 
ers may come to a different conclusion to meet the requirements of the avionics systems they are tasked 
to design. It is the opinion of the PHIAT team members that there will be several different communica- 
tion architectures in manned spacecraft to support integrating the critical functions needed to ensure 
safety with the functions needed for vehicle health monitoring. This is inevitable, as the differing system 
requirements are traded against the real costs of system implementation. The major challenge will be in 
defining what the systems will do and how the systems will be implemented. 

APPENDIX-COMMUNICATION ARCHITECTURE COMPARISON MATRIX


Table 1 is a comparison matrix for the features of the following communication architectures:

• SAFEbus
• TTP/C
• FlexRay
• TTCAN
• IEEE 1394b
• SpaceWire
• Ethernet 10/100 Base-T
• Avionics Full-Duplex Switched Ethernet
• Fibre Channel
• Gigabit Ethernet.


Table 1. Communication architecture comparison matrix. 